Activity R&D #8357
Benchmarks
Benchmark execution time: 2026-05-14 09:49:35
Comparing candidate commit 48af466 in PR branch
Some scenarios are present only in baseline or only in candidate runs. If you didn't create or remove some scenarios in your branch, this may be a sign of crashed benchmarks 💥💥💥
Scenarios present only in baseline:
Found 5 performance improvements and 4 performance regressions! Performance is the same for 49 metrics, 14 unstable metrics, 89 known flaky benchmarks, 37 flaky benchmarks without significant changes.
8a21358 to b35deb9
489f411 to abb2a10
…replacement
Introduces a new opt-in approach (DD_TRACE_OTEL_ACTIVITY_INTERCEPTION_ENABLED=true)
that intercepts System.Diagnostics.Activity methods via CallTarget instead of using
the managed ActivityListener. Goals: reduce memory usage by eliminating Activity's
internal tag storage duplication and the ConcurrentDictionary span lookup.
New CallTarget integrations:
- ActivityStartIntegration: intercepts Activity.Start() to create a Span and link
it to the Activity via GetCustomProperty/SetCustomProperty ('__dd_span__' key)
- ActivityStopIntegration: intercepts Activity.Stop() to finish the Span with
correct timing, extracting links/events/status at stop time
- ActivityAddTagStringIntegration: intercepts AddTag(string, string?) to write
directly to the Span and skip Activity's internal tag list
- ActivityAddTagObjectIntegration: intercepts AddTag(string, object?) similarly
- ActivitySetTagIntegration: intercepts SetTag(string, object?) similarly
- ActivitySetStatusIntegration: intercepts SetStatus() to map OTel status to
Datadog error tags, bypassing Activity's internal status field
- ActivityDisplayNameIntegration: intercepts set_DisplayName to set Span.ResourceName
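The linking step described for ActivityStartIntegration can be sketched with the public Activity custom-property API. This is a minimal illustration only: `DemoSpan` and `ActivityLinkDemo` are hypothetical stand-ins, not types from this PR, and only the `__dd_span__` key is taken from the text above.

```csharp
using System;
using System.Diagnostics;

// Hypothetical stand-in for a tracer span; not the Datadog Span type.
public sealed class DemoSpan
{
    public string ResourceName { get; set; } = string.Empty;
}

public static class ActivityLinkDemo
{
    // Key name as quoted in the commit message above.
    private const string SpanKey = "__dd_span__";

    // Attach the span to the Activity so later calls can find it
    // without a side dictionary keyed by Activity.
    public static void Link(Activity activity, DemoSpan span)
        => activity.SetCustomProperty(SpanKey, span);

    // Returns null when nothing was linked under the key.
    public static DemoSpan TryGetSpan(Activity activity)
        => activity.GetCustomProperty(SpanKey) as DemoSpan;

    // Round trip: start an Activity, link a span, read it back.
    public static string RoundTrip()
    {
        using var activity = new Activity("demo");
        activity.Start();
        Link(activity, new DemoSpan { ResourceName = "demo" });
        return TryGetSpan(activity)?.ResourceName;
    }
}
```

Because the reference lives on the Activity itself, it goes away with the Activity; no separate lookup table has to be cleaned up.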
Supporting infrastructure:
- ActivityCustomPropertyAccessor<TTarget>: zero-allocation cached delegates for
reading/writing the Scope via Activity's custom property API
- ActivitySourceFilter: shared filter for source names already handled by other
Datadog integrations (mirrors IgnoreActivityHandler.SourcesNames)
Configuration changes:
- Added DD_TRACE_OTEL_ACTIVITY_INTERCEPTION_ENABLED feature flag to TracerSettings,
supported-configurations.yaml, and generated ConfigurationKeys
- Instrumentation.cs: skips managed ActivityListener when interception is enabled
- MutableSettings.cs: keeps OpenTelemetry integration enabled under either mode
Other changes:
- IActivity5: added GetCustomProperty/SetCustomProperty to the duck type interface
- OtlpHelpers: added ExtractLinksAndEventsFromActivity helper for stop integration
- ResourceAttributeProcessorHelper: uses custom property lookup when interception
is enabled, falling back to ConcurrentDictionary for the legacy listener path
All integrations registered in InstrumentationDefinitions.g.cs.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Phase 1: Fix IsAllDataRequested
ActivityStartIntegration.CreateAndLinkScope now sets activity5.IsAllDataRequested = true after linking the Span. The managed ActivityListener did this implicitly via its Sample callback returning AllData; since we skip the listener when interception is enabled, we must set it explicitly so user code guarded by `if (activity.IsAllDataRequested)` runs.
Phase 2: Getter interceptions (redirect reads to the Span)
The setter integrations use skipMethodBody to bypass Activity's internal storage, which means reading from the Activity would return stale/empty values. Five new getter integrations restore correct observable state by reading from the linked Span:
- ActivityDisplayNameGetterIntegration: get_DisplayName → span.ResourceName ?? span.OperationName
- ActivityStatusGetterIntegration: get_Status → ActivityStatusCode reconstructed from span's "otel.status_code" tag (Enum.ToObject handles the foreign enum conversion)
- ActivityStatusDescriptionGetterIntegration: get_StatusDescription → span's "otel.status_description" tag
- ActivityTagsGetterIntegration: get_Tags → all span string tags enumerated via ITags.EnumerateTags into List<KVP<string,string?>>
- ActivityTagObjectsGetterIntegration: get_TagObjects → same but boxed as object?, matching Activity.TagObjects' IEnumerable<KVP<string,object?>> contract
Note on TagObjects: numeric tags set via SetMetric are not reflected since the Span stores them separately from string tags. This is an acceptable R&D limitation.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
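For context on Phase 1, the sketch below shows how a listener's Sample callback drives `Activity.IsAllDataRequested` in the stock System.Diagnostics API. `SamplingDemo` and the source name are illustrative assumptions and not part of this PR.

```csharp
using System;
using System.Diagnostics;

public static class SamplingDemo
{
    // Starts one activity on a fresh source whose only listener returns
    // the given sampling result, then reports IsAllDataRequested.
    public static bool IsAllDataRequestedFor(ActivitySamplingResult result)
    {
        // Unique name so no other listener in the process matches it by name.
        string sourceName = "Demo.Source." + Guid.NewGuid().ToString("N");
        using var source = new ActivitySource(sourceName);
        using var listener = new ActivityListener
        {
            ShouldListenTo = s => s.Name == sourceName,
            Sample = (ref ActivityCreationOptions<ActivityContext> options) => result,
        };
        ActivitySource.AddActivityListener(listener);

        // StartActivity returns null when the sampling result is None.
        using Activity activity = source.StartActivity("work");
        return activity != null && activity.IsAllDataRequested;
    }
}
```

With `ActivitySamplingResult.AllData` the flag is set by the runtime; with `PropagationData` the activity is still created but the flag stays false, which is the state user code guarded by `if (activity.IsAllDataRequested)` would observe if nothing sets it.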
Add the 5 new Activity getter integration classes to the GetIntegrationId switch in the generated InstrumentationDefinitions file so they are correctly mapped to IntegrationId.OpenTelemetry at runtime.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Add SubmitsTracesWithInterception to both NetActivitySdkTests and OpenTelemetrySdkTests. Each variant enables DD_TRACE_OTEL_ACTIVITY_INTERCEPTION_ENABLED instead of DD_TRACE_OTEL_ENABLED and shares the same snapshot file as the existing SubmitsTraces test, asserting that the CallTarget-based interception approach produces output identical to the managed ActivityListener approach.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Replace the global _cachedResource with a __dd_resource__ custom property set in ResourceAttributeProcessor.OnStart. ActivityStartIntegration reads it back after Activity.Start() returns, before copying activity tags so reserved tags like service.name=ServiceNameOverride can correctly override the resource. Each TracerProvider's processor stashes its own resource on its activities, so apps that build multiple TracerProviders (e.g. one with no service name resource) no longer have the first provider's resource leak across.
Move the property-key constants out of the generic accessor into a non-generic ActivityCustomPropertyKeys helper so they can be referenced from non-generic call sites.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
ActivitySource.StartActivity returns null when no listener is interested in a source, so unsubscribed sources (e.g. an ActivitySource the application creates but never adds to a TracerProvider) never produce activities and our CallTarget intercept on Activity.Start never fires.
Re-enable the managed listener in interception mode so the listener's ShouldListenTo keeps those sources alive; ActivityListenerHandler short-circuits its ActivityStarted/ActivityStopped callbacks in interception mode so it doesn't create duplicate Datadog spans alongside the interception path.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
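The null-return behaviour that motivates this commit can be reproduced with the stock API alone. The sketch below is an assumption-laden illustration (`ListenerDemo` and the source name are made up here) and assumes no other listener in the process subscribes to every source.

```csharp
using System;
using System.Diagnostics;

public static class ListenerDemo
{
    // Returns whether an activity was created before and after a listener
    // subscribed to the source.
    public static (bool Before, bool After) Run()
    {
        string name = "Demo.Unsubscribed." + Guid.NewGuid().ToString("N");
        using var source = new ActivitySource(name);

        // No listener is interested yet, so StartActivity yields null and
        // Activity.Start is never reached.
        Activity first = source.StartActivity("op");
        bool createdBefore = first != null;
        first?.Dispose();

        using var listener = new ActivityListener
        {
            ShouldListenTo = s => s.Name == name,
            Sample = (ref ActivityCreationOptions<ActivityContext> options)
                => ActivitySamplingResult.AllData,
        };
        ActivitySource.AddActivityListener(listener);

        // With a subscribed listener the same call now creates an Activity.
        using Activity second = source.StartActivity("op");
        return (createdBefore, second != null);
    }
}
```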
…erception
In the interception path SetStatus is intercepted before the user calls RecordException, so the exception attributes never reach the span via ActivitySetStatusIntegration. Mirror the managed listener behaviour by calling OtlpHelpers.ExtractExceptionAttributes from ActivityStopIntegration when the span is in Error state, so error.type / error.msg / error.stack get populated from the exception event in Activity.Events.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…pshot
Bring the test back from a span-count check to a full Verify-snapshot assertion. Use a dedicated .Interception snapshot rather than sharing with SubmitsTraces because of one known parentage gap: StartActiveSpan(name, kind, parentTelemetrySpan) creates a child Activity with Activity.Parent = null (OTel passes the parent context as ActivityContext, not as an Activity reference), so the interception path treats it as a remote parent and the child becomes the root of a fresh TraceContext (extra runtime-id tag and Metrics block). This is the same parentage class as the W3C-only-parent gap in NetActivitySdk and is tracked alongside it; the test docstring records it.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
OTel's `Tracer.StartActiveSpan(name, kind, parentTelemetrySpan, …)` / `Tracer.StartSpan(name, kind, parentTelemetrySpan, …)` lower the parent TelemetrySpan into an `ActivityContext` (TraceId+SpanId) before calling `ActivitySource.StartActivity`, so by the time our `Activity.Start` intercept fires the in-process parent reference is gone: the child activity has `Activity.Parent == null` despite having a `RawParentSpanId`. Previously this fell through to the remote-parent branch, creating a fresh `TraceContext` for the child; the child became a local trace root with an extra `runtime-id` tag, an independent sampling decision, and shipped as its own segment.
Fix: instrument the two `TelemetrySpan`-parent overloads themselves. `OnMethodBegin` resolves the parent's Datadog `Scope` (duck-cast into the `Activity` field on `TelemetrySpan` via [DuckField], then read `__dd_span__` via the existing `IActivity5` duck type) and pushes it on a thread-local stack. `ActivityStartIntegration.CreateAndLinkScope` peeks that stack only on the cold path that already failed the `Activity.Parent` lookup, so common activities (root spans, normal in-process children) read nothing extra. The OTel-API integration's `OnMethodEnd` pops, gated on a sentinel marker so a config flip mid-call doesn't leak a stack entry.
ThreadStatic was chosen over AsyncLocal because the chain `OTel-API → ActivitySource.StartActivity → Activity.Start → ActivityStartIntegration.OnMethodEnd` is fully synchronous on the same thread, so AsyncLocal's execution-context propagation buys nothing.
After the fix the OpenTelemetrySdk interception snapshot is byte-identical to the managed-listener snapshot, so the dedicated `.Interception` snapshot is removed and the test shares `OpenTelemetrySdkTests.verified.txt`. The `SetParentId(traceId, spanId)` case in NetActivitySdk remains a deliberate remote-parent path (the API itself is context-only) and keeps its own `.Interception` snapshot.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
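The push/peek/pop protocol from the commit above can be sketched as follows. This is a simplified model under stated assumptions: `PendingParentStack` and `Marker` are hypothetical names, and the real integration stores a Datadog `Scope` rather than a plain `object`.

```csharp
using System;
using System.Collections.Generic;

public static class PendingParentStack
{
    // One stack per thread. [ThreadStatic] fields are created lazily because
    // a field initializer would only run for the first thread.
    [ThreadStatic]
    private static Stack<object> _stack;

    // Returned by Push so the matching Pop knows whether it owns an entry,
    // even if configuration changed between the two calls.
    public readonly struct Marker
    {
        public readonly bool Pushed;
        public Marker(bool pushed) => Pushed = pushed;
    }

    // Called on method begin with the resolved parent (may be null).
    public static Marker Push(object parentScope)
    {
        if (parentScope == null)
        {
            return new Marker(false);
        }

        (_stack ??= new Stack<object>()).Push(parentScope);
        return new Marker(true);
    }

    // Cold-path lookup: the innermost pending parent on this thread, or null.
    public static object Peek()
        => _stack != null && _stack.Count > 0 ? _stack.Peek() : null;

    // Called on method end; removes an entry only if this call pushed one.
    public static void Pop(Marker marker)
    {
        if (marker.Pushed && _stack != null && _stack.Count > 0)
        {
            _stack.Pop();
        }
    }
}
```

Pairing each Pop with the marker from its own Push keeps the stack balanced for nested calls on the same thread, which is all the synchronous call chain described above requires.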
`Datadog.Trace.OpenTelemetry.Sdk.Initialize()` swaps OTel's no-op default text-map propagator for a `CompositeTextMapPropagator` containing `TraceContextPropagator + BaggagePropagator`. It was gated on `IsActivityListenerEnabled` only, so in interception mode the default propagator stayed as the no-op, meaning user calls to `DefaultTextMapPropagator.Inject(...)` silently wrote nothing even when `OpenTelemetry.Baggage.Current` was populated. Same class of bug as the `ActivityListener.Initialize()` gate we already fixed; same fix.
Symptom: `NetActivitySdkTests`'s `RunOpenTelemetryApiInject` sets baggage, injects via the propagator into a headers dict, then stamps the resulting `baggage` HTTP header back as a span tag. In interception mode the tag ended up null and was dropped from the snapshot. After the fix the tag appears as `key=value` (added to the dedicated `.Interception` snapshot to match listener-path output, modulo the deliberate `SetParentId(traceId, spanId)` remote-parent divergence).
Verified across the full Windows TFM matrix: net48, netcoreapp3.1, net6.0, net7.0, net8.0, net9.0, net10.0; all pass both `OpenTelemetrySdkTests.SubmitsTracesWithInterception` and `NetActivitySdkTests.SubmitsTracesWithInterception`.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Without this, every Activity.SetTag / AddTag / set_DisplayName / SetStatus
/ getter call across the host process — for users who haven't opted into
interception — would still hit Activity.GetCustomProperty("__dd_span__")
in the integration's OnMethodBegin. That's a Dictionary lookup on every
intercepted call, paid by every modern .NET 6+ process whose JIT applies
our CallTarget rewrites.
Add `if (!IsActivityInterceptionEnabled) return GetDefault();` at the top
of each Activity-direct intercept (ActivityStartIntegration and
ActivityStopIntegration already had it). Brings non-interception per-call
cost down to a single bool field read.
Verified across the full Windows TFM matrix that interception tests
(SubmitsTracesWithInterception) still pass on net48, netcoreapp3.1,
net6.0, net7.0, net8.0, net9.0, net10.0; and listener-path tests
(SubmitsTraces, SubmitsTracesWithActivitySource) still pass on net6.0.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
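The early-return guard from the commit above has roughly the following shape. The sketch is deliberately detached from the tracer's CallTarget types: `GuardedInterceptDemo`, its flag field, and the `object` return value are hypothetical stand-ins for the real integration and its state type.

```csharp
using System.Diagnostics;

public static class GuardedInterceptDemo
{
    // Hypothetical stand-in for the tracer's cached feature flag.
    public static bool IsActivityInterceptionEnabled;

    // Mimics the top of an intercept: bail out on the flag before touching
    // the Activity's custom-property storage.
    public static object OnBegin(Activity activity)
    {
        if (!IsActivityInterceptionEnabled)
        {
            // Cheap path for processes that have not opted in:
            // a single static bool read, no property lookup.
            return null;
        }

        return activity.GetCustomProperty("__dd_span__");
    }
}
```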
7e70a43 to 7832b66
Execution-Time Benchmarks Report ⏱️
Execution-time results for samples comparing This PR (8357) and master.
✅ No regressions detected - check the details below
Full Metrics Comparison: FakeDbCommand, HttpMessageHandler
Comparison explanation
Execution-time benchmarks measure the whole time it takes to execute a program, and are intended to measure the one-off costs. Cases where the execution time results for the PR are worse than latest master results are highlighted in **red**.
Note that these results are based on a single point-in-time result for each branch. For full results, see the dashboard.
Duration charts
Each row gives the mean value of the run and the p99 interval based on the mean and StdDev of the test run. All execution times are in ms.

| Sample | Runtime | Scenario | This PR (8357) mean | This PR (8357) interval | master mean | master interval |
|---|---|---|---|---|---|---|
| FakeDbCommand | .NET Framework 4.8 | Baseline | 73 | 70 to 75 | 74 | 70 to 78 |
| FakeDbCommand | .NET Framework 4.8 | Bailout | 76 | 74 to 78 | 77 | 75 to 79 |
| FakeDbCommand | .NET Framework 4.8 | CallTarget+Inlining+NGEN | 1,107 | 1049 to 1165 | 1,103 | 1043 to 1164 |
| FakeDbCommand | .NET Core 3.1 | Baseline | 113 | 110 to 116 | 114 | 109 to 120 |
| FakeDbCommand | .NET Core 3.1 | Bailout | 114 | 111 to 116 | 114 | 110 to 117 |
| FakeDbCommand | .NET Core 3.1 | CallTarget+Inlining+NGEN | 789 | 767 to 811 | 793 | 762 to 823 |
| FakeDbCommand | .NET 6 | Baseline | 101 | 97 to 105 | 102 | 97 to 108 |
| FakeDbCommand | .NET 6 | Bailout | 104 | 99 to 109 | 102 | 100 to 104 |
| FakeDbCommand | .NET 6 | CallTarget+Inlining+NGEN | 944 | 914 to 975 | 947 | 902 to 991 |
| FakeDbCommand | .NET 8 | Baseline | 102 | 95 to 108 | 101 | 96 to 107 |
| FakeDbCommand | .NET 8 | Bailout | 100 | 97 to 102 | 103 | 98 to 108 |
| FakeDbCommand | .NET 8 | CallTarget+Inlining+NGEN | 821 | 785 to 857 | 821 | 778 to 864 |
| HttpMessageHandler | .NET Framework 4.8 | Baseline | 200 | 195 to 206 | 199 | 193 to 205 |
| HttpMessageHandler | .NET Framework 4.8 | Bailout | 203 | 196 to 210 | 202 | 199 to 206 |
| HttpMessageHandler | .NET Framework 4.8 | CallTarget+Inlining+NGEN | 1,205 | 1148 to 1262 | 1,197 | 1153 to 1240 |
| HttpMessageHandler | .NET Core 3.1 | Baseline | 287 | 281 to 294 | 287 | 278 to 296 |
| HttpMessageHandler | .NET Core 3.1 | Bailout | 287 | 281 to 294 | 287 | 281 to 294 |
| HttpMessageHandler | .NET Core 3.1 | CallTarget+Inlining+NGEN | 956 | 938 to 975 | 961 | 945 to 978 |
| HttpMessageHandler | .NET 6 | Baseline | 280 | 272 to 289 | 276 | 268 to 285 |
| HttpMessageHandler | .NET 6 | Bailout | 281 | 276 to 285 | 278 | 272 to 284 |
| HttpMessageHandler | .NET 6 | CallTarget+Inlining+NGEN | 1,159 | 1123 to 1195 | 1,157 | 1125 to 1190 |
| HttpMessageHandler | .NET 8 | Baseline | 278 | 271 to 284 | 279 | 274 to 285 |
| HttpMessageHandler | .NET 8 | Bailout | 278 | 271 to 286 | 279 | 275 to 284 |
| HttpMessageHandler | .NET 8 | CallTarget+Inlining+NGEN | 1,037 | 998 to 1076 | 1,037 | 995 to 1078 |
Summary of changes
Reason for change
Implementation details
Test coverage
Other details